Efficient Crowdsourcing for Metadata Generation
Abstract
Rich and correct metadata still plays a central role in accessing data sources in a semantic fashion. However, at the time of content creation it is often virtually impossible to foresee all possible uses of content and to provide all interesting index terms or categorizations. Therefore, semantic retrieval techniques have to provide ways of allowing access to data via missing metadata, which is only created when needed, i.e. at query time. Since the creation of most such metadata will to some degree depend on human judgement (either how to create it in a meaningful way or by actually providing it), crowdsourcing techniques have recently attracted attention. By incorporating human workers into the query execution process, crowd-enabled databases can already facilitate intelligent, social capabilities like completing missing data at query time or performing cognitive operators. Typical examples are ranking tasks, evaluating the correctness of automatically extracted information, or judging the similarity or subjective appeal of images. But for actually creating metadata for potentially large data sources, the number of crowdsourced mini-tasks needed to fill in missing metadata values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simple crowdsourcing to obtain all values individually, this talk discusses utilizing user-generated data found in the Social Web. By exploiting user ratings, semantically meaningful perceptual spaces can be built, i.e. highly compressed representations of the opinions, impressions, and perceptions of large numbers of users. Then, using a few training samples obtained by expert crowdsourcing, missing metadata can be extracted automatically from the perceptual space with high quality and at low cost. First experiments show that this approach can indeed boost both the performance and the quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
(Dagstuhl Seminar Series No. 12171, 22.-27.04.2012. G. Antoniou, O. Corcho, K. Aberer, E. Simperl, R. Studer: Semantic Data Management)
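The two steps named in the abstract (build a compressed space from user ratings, then extract a missing attribute from a few labeled samples) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the method presented in the talk: the abstract does not say how the space is built or how values are extracted, so the sketch swaps in a plain SGD matrix factorization and a nearest-centroid classifier. The ratings, item names, labels, and the functions `build_space` and `predict_labels` are hypothetical.

```python
import random


def build_space(ratings, dims=2, epochs=200, lr=0.02, reg=0.05, seed=0):
    """Factorize (user, item, rating) triples with SGD.

    Returns {item: latent vector}; this stands in for a "perceptual space".
    Assumption: a basic regularized factor model, chosen only for brevity.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mean = sum(r for _, _, r in ratings) / len(ratings)
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(dims)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(dims)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            pu, qi = p[u], q[i]
            # Error of the current prediction (global mean + dot product).
            err = r - mean - sum(a * b for a, b in zip(pu, qi))
            for k in range(dims):
                pu_k, qi_k = pu[k], qi[k]
                pu[k] += lr * (err * qi_k - reg * pu_k)
                qi[k] += lr * (err * pu_k - reg * qi_k)
    return q


def predict_labels(space, labeled):
    """Assign every item the label of the nearest class centroid.

    `labeled` holds the few expert-provided samples: {item: label}.
    """
    groups = {}
    for item, label in labeled.items():
        groups.setdefault(label, []).append(space[item])
    centroids = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return {
        item: min(centroids, key=lambda label: dist(vec, centroids[label]))
        for item, vec in space.items()
    }


# Hypothetical toy data: two users prefer m1/m2, two prefer m3/m4.
ratings = [
    ("u1", "m1", 5), ("u1", "m2", 4), ("u1", "m3", 1), ("u1", "m4", 2),
    ("u2", "m1", 4), ("u2", "m2", 5), ("u2", "m3", 2), ("u2", "m4", 1),
    ("u3", "m1", 1), ("u3", "m2", 2), ("u3", "m3", 5), ("u3", "m4", 4),
    ("u4", "m1", 2), ("u4", "m2", 1), ("u4", "m3", 4), ("u4", "m4", 5),
]
# Only two items carry an expert-provided value for the missing attribute.
labeled = {"m1": "comedy", "m3": "drama"}

space = build_space(ratings)
labels = predict_labels(space, labeled)
print(labels)
```

In this sketch only two judgements are requested from experts, and the remaining values are read off the space instead of being crowdsourced one by one, which is the cost argument the abstract makes. Whether the predicted values are good enough depends on how well the rating data reflects the attribute in question.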
Similar resources
Metadata Enrichment for Automatic Data Entry Based on Relational Data Models
The automatic generation of data entry forms from relational data models is a well-known idea that is being discussed more and more as agile software development methods and the accompanying programming tools grow in popularity. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...
LIA @ MediaEval 2013 Crowdsourcing Task: Metadata or not Metadata? That is a Fashion Question
In this paper, we describe the LIA system proposed for the MediaEval 2013 Crowdsourcing for Social Multimedia task. The aim is to associate an accurate label to an image among multiple noisy labels collected from a crowdsourcing platform. In particular, the task participants have to predict two types of binary labels for each considered image. The first one mentions that an image is truly fashi...
Visual Tools for Crowdsourcing Data Validation within the Globeland30 Geoportal
This research aims to investigate the role of visualization of the user generated data that can empower the geoportal of GlobeLand30 produced by NGCC (National Geomatics Center of China). The focus is set on the development of a concept of tools that can extend the Geo-tagging functionality and make use of it for different target groups. The anticipated tools should improve the continuous data ...
Metadata Squared: Enhancing Its Usability for Volunteered Geographic Information and the GeoWeb
D. Sui et al. (eds.), Crowdsourcing Geographic Knowledge: Volunteered Geographic Information (VGI) in Theory and Practice, DOI 10.1007/978-94-007-4587-2_4, © Springer Science+Business Media Dordrecht 2013 Abstract The Internet has brought many changes to the way geographic information is created and shared. One aspect that has not changed is metadata. Static spatial data quality descriptions we...
Citizen Archivists at Play: Game Design for Gathering Metadata for Cultural Heritage Institutions
In this paper, we detail our design process for the Metadata Games project and discuss a number of design challenges involved in making a “metadata game,” such as incentivizing players to offer accurate information, devising and deploying methods for verifying the accuracy of data, and introducing effective motivations for ensuring high replay potential. We present our “Outlier Design” model fo...
Publication date: 2012